64 research outputs found

    Syntactically Look-Ahead Attention Network for Sentence Compression

    Full text link
    Sentence compression is the task of compressing a long sentence into a short one by deleting redundant words. In sequence-to-sequence (Seq2Seq) based models, the decoder unidirectionally decides to retain or delete words. Thus, it cannot usually explicitly capture the relationships between decoded words and unseen words that will be decoded in the future time steps. Therefore, to avoid generating ungrammatical sentences, the decoder sometimes drops important words in compressing sentences. To solve this problem, we propose a novel Seq2Seq model, syntactically look-ahead attention network (SLAHAN), that can generate informative summaries by explicitly tracking both dependency parent and child words during decoding and capturing important words that will be decoded in the future. The results of the automatic evaluation on the Google sentence compression dataset showed that SLAHAN achieved the best kept-token-based-F1, ROUGE-1, ROUGE-2 and ROUGE-L scores of 85.5, 79.3, 71.3 and 79.1, respectively. SLAHAN also improved the summarization performance on longer sentences. Furthermore, in the human evaluation, SLAHAN improved informativeness without losing readability.Comment: AAAI 202

    Table and Image Generation for Investigating Knowledge of Entities in Pre-trained Vision and Language Models

    Full text link
    In this paper, we propose a table and image generation task to verify how the knowledge about entities acquired from natural language is retained in Vision & Language (V & L) models. This task consists of two parts: the first is to generate a table containing knowledge about an entity and its related image, and the second is to generate an image from an entity with a caption and a table containing related knowledge of the entity. In both tasks, the model must know the entities used to perform the generation properly. We created the Wikipedia Table and Image Generation (WikiTIG) dataset from about 200,000 infoboxes in English Wikipedia articles to perform the proposed tasks. We evaluated the performance on the tasks with respect to the above research question using the V & L model OFA, which has achieved state-of-the-art results in multiple tasks. Experimental results show that OFA forgets part of its entity knowledge by pre-training as a complement to improve the performance of image related tasks.Comment: Accepted at ACL 202

    Model-based Subsampling for Knowledge Graph Completion

    Full text link
    Subsampling is effective in Knowledge Graph Embedding (KGE) for reducing overfitting caused by the sparsity in Knowledge Graph (KG) datasets. However, current subsampling approaches consider only frequencies of queries that consist of entities and their relations. Thus, the existing subsampling potentially underestimates the appearance probabilities of infrequent queries even if the frequencies of their entities or relations are high. To address this problem, we propose Model-based Subsampling (MBS) and Mixed Subsampling (MIX) to estimate their appearance probabilities through predictions of KGE models. Evaluation results on datasets FB15k-237, WN18RR, and YAGO3-10 showed that our proposed subsampling methods actually improved the KG completion performances for popular KGE models, RotatE, TransE, HAKE, ComplEx, and DistMult.Comment: Accepted by AACL 2023; 9 pages, 3 figures, 5 table

    Does Pre-trained Language Model Actually Infer Unseen Links in Knowledge Graph Completion?

    Full text link
    Knowledge graphs (KGs) consist of links that describe relationships between entities. Due to the difficulty of manually enumerating all relationships between entities, automatically completing them is essential for KGs. Knowledge Graph Completion (KGC) is a task that infers unseen relationships between entities in a KG. Traditional embedding-based KGC methods, such as RESCAL, TransE, DistMult, ComplEx, RotatE, HAKE, HousE, etc., infer missing links using only the knowledge from training data. In contrast, the recent Pre-trained Language Model (PLM)-based KGC utilizes knowledge obtained during pre-training. Therefore, PLM-based KGC can estimate missing links between entities by reusing memorized knowledge from pre-training without inference. This approach is problematic because building KGC models aims to infer unseen links between entities. However, conventional evaluations in KGC do not consider inference and memorization abilities separately. Thus, a PLM-based KGC method, which achieves high performance in current KGC evaluations, may be ineffective in practical applications. To address this issue, we analyze whether PLM-based KGC methods make inferences or merely access memorized knowledge. For this purpose, we propose a method for constructing synthetic datasets specified in this analysis and conclude that PLMs acquire the inference abilities required for KGC through pre-training, even though the performance improvements mostly come from textual information of entities and relations.Comment: 15 pages, 10 figure

    Comprehensive Analysis of Negative Sampling in Knowledge Graph Representation Learning

    Full text link
    Negative sampling (NS) loss plays an important role in learning knowledge graph embedding (KGE) to handle a huge number of entities. However, the performance of KGE degrades without hyperparameters such as the margin term and number of negative samples in NS loss being appropriately selected. Currently, empirical hyperparameter tuning addresses this problem at the cost of computational time. To solve this problem, we theoretically analyzed NS loss to assist hyperparameter tuning and understand the better use of the NS loss in KGE learning. Our theoretical analysis showed that scoring methods with restricted value ranges, such as TransE and RotatE, require appropriate adjustment of the margin term or the number of negative samples different from those without restricted value ranges, such as RESCAL, ComplEx, and DistMult. We also propose subsampling methods specialized for the NS loss in KGE studied from a theoretical aspect. Our empirical analysis on the FB15k-237, WN18RR, and YAGO3-10 datasets showed that the results of actually trained models agree with our theoretical findings.Comment: Accepted at ICML202
    corecore